Stochastic partial differential equations (SPDEs) are the mathematical tool of choice for modelling dynamical systems evolving under the influence of randomness. By formulating the search for a mild solution of an SPDE as a neural fixed-point problem, we introduce the Neural SPDE model, capable of learning solution operators of (possibly stochastic) PDEs from partially observed data. Our model provides an extension to two classes of physics-inspired neural architectures. On the one hand, it extends Neural CDEs, SDEs, and RDEs -- continuous-time analogues of RNNs -- in that it is capable of processing incoming sequential information even when the latter evolves in an infinite-dimensional state space. On the other hand, it extends Neural Operators -- generalizations of neural networks that model mappings between function spaces -- in that it can be used to learn the solution operator $(u_0, \xi) \mapsto u$ of an SPDE depending simultaneously on the initial condition $u_0$ and a realization of the driving noise $\xi$. A Neural SPDE is resolution-invariant, can be trained using memory-efficient backpropagation based on implicit differentiation, and, once trained, evaluates up to 3 orders of magnitude faster than traditional solvers. Experiments on various semilinear SPDEs, including the 2D stochastic Navier-Stokes equations, demonstrate how Neural SPDEs can learn complex spatiotemporal dynamics with better accuracy than all alternative models while using only a modest amount of training data.
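For context, the fixed-point formulation referred to above can be made explicit. As a hedged sketch in standard mild-solution notation (not taken verbatim from the abstract): for a semilinear SPDE $\mathrm{d}u = (Lu + F(u))\,\mathrm{d}t + G(u)\,\mathrm{d}\xi$ with linear operator $L$ generating a semigroup $e^{tL}$, the mild solution is the fixed point of the map

```latex
u_t \;=\; e^{tL} u_0 \;+\; \int_0^t e^{(t-s)L} F(u_s)\,\mathrm{d}s \;+\; \int_0^t e^{(t-s)L} G(u_s)\,\mathrm{d}\xi_s ,
```

and it is this fixed-point structure that the neural model parameterizes and solves iteratively.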
translated by 谷歌翻译
A stochastic process is a random variable taking values in some path space. However, reducing a stochastic process to a path-valued random variable ignores its filtration, i.e. the flow of information carried by the process through time. By conditioning the process on its filtration, we introduce a family of higher order kernel mean embeddings (KMEs) that generalizes the notion of KME and captures additional information related to the filtration. We derive empirical estimators for the associated higher order maximum mean discrepancies (MMDs) and prove consistency. We then construct a filtration-sensitive kernel two-sample test able to pick up information that gets missed by the standard MMD test. In addition, leveraging our higher order MMDs, we construct a family of universal kernels on stochastic processes that allows one to solve real-world calibration and optimal stopping problems (such as the pricing of American options) via classical kernel-based regression methods. Finally, adapting existing tests for conditional independence to the case of stochastic processes, we design a causal discovery algorithm to recover the causal graph of structural dependencies among interacting bodies solely from observations of their multidimensional trajectories.
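For orientation, the classical (order-zero) empirical MMD that the higher order construction generalizes can be sketched as follows. This is a minimal illustration with an off-the-shelf RBF kernel on flattened, discretised paths; the kernel choice and variable names are illustrative, not taken from the paper.

```python
import numpy as np

def rbf_kernel(x, y, bandwidth=1.0):
    """Gaussian RBF kernel between two flattened sample paths."""
    return np.exp(-np.sum((x - y) ** 2) / (2.0 * bandwidth ** 2))

def mmd_squared(X, Y, kernel=rbf_kernel):
    """Unbiased empirical estimate of MMD^2 between two samples.

    X, Y: arrays of shape (n_samples, n_features), e.g. discretised paths.
    """
    m, n = len(X), len(Y)
    # Within-sample terms exclude the diagonal (unbiased estimator).
    k_xx = sum(kernel(X[i], X[j]) for i in range(m) for j in range(m) if i != j)
    k_yy = sum(kernel(Y[i], Y[j]) for i in range(n) for j in range(n) if i != j)
    # Cross term uses all pairs.
    k_xy = sum(kernel(x, y) for x in X for y in Y)
    return k_xx / (m * (m - 1)) + k_yy / (n * (n - 1)) - 2.0 * k_xy / (m * n)
```

Two identical samples give an estimate of exactly zero, while well-separated samples approach the kernel's maximum discrepancy; the higher order MMDs of the paper enrich this by additionally embedding the conditioned (filtration-aware) process.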
Reading comprehension of legal text can be a particularly challenging task due to the length and complexity of legal clauses and a shortage of expert-annotated datasets. To address this challenge, we introduce the Merger Agreement Understanding Dataset (MAUD), an expert-annotated reading comprehension dataset based on the American Bar Association's 2021 Public Target Deal Points Study, with over 39,000 examples and over 47,000 total annotations. Our fine-tuned Transformer baselines show promising results, with models performing well above random on most questions. However, on a large subset of questions, there is still room for significant improvement. As the only expert-annotated merger agreement dataset, MAUD is valuable as a benchmark for both the legal profession and the NLP community.
Diffusion models have shown a great ability at bridging the performance gap between predictive and generative approaches for speech enhancement. We have shown that they may even outperform their predictive counterparts for non-additive corruption types or when they are evaluated on mismatched conditions. However, diffusion models suffer from a high computational burden, mainly because they require running a neural network for each reverse diffusion step, whereas predictive approaches only require one pass. As diffusion models are generative approaches, they may also produce vocalizing and breathing artifacts in adverse conditions. In comparison, in such difficult scenarios, predictive models typically do not produce such artifacts but tend to distort the target speech instead, thereby degrading the speech quality. In this work, we present a stochastic regeneration approach where an estimate given by a predictive model is provided as a guide for further diffusion. We show that the proposed approach uses the predictive model to remove the vocalizing and breathing artifacts while producing very high quality samples thanks to the diffusion model, even in adverse conditions. We further show that this approach enables the use of lighter sampling schemes with fewer diffusion steps without sacrificing quality, thus lifting the computational burden by an order of magnitude. Source code and audio examples are available online (https://uhh.de/inf-sp-storm).
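The control flow of the stochastic regeneration idea can be sketched as follows. This is a deliberately toy illustration: `predictive_model` and `score_model` are placeholder stand-ins (not the actual networks of the paper), and the reverse integrator is a plain Euler-Maruyama loop. The point it shows is structural: the reverse diffusion is started from the predictive estimate rather than from pure noise, so only a few reverse steps are needed.

```python
import numpy as np

rng = np.random.default_rng(0)

def predictive_model(y):
    """Placeholder for the predictive (discriminative) enhancer D(y)."""
    return 0.5 * y  # crude stand-in denoiser

def score_model(x, y, t):
    """Placeholder score network s(x, y, t); here a toy linear score
    pulling x toward the predictive estimate."""
    return -(x - predictive_model(y)) / max(t, 1e-3)

def stochastic_regeneration(y, n_steps=8, sigma=0.1):
    """Sketch of the regeneration scheme: initialize reverse diffusion
    at D(y) plus a little noise, then run a short reverse chain."""
    x = predictive_model(y) + sigma * rng.standard_normal(y.shape)
    ts = np.linspace(1.0, 1e-3, n_steps)
    dt = ts[0] - ts[1]
    for t in ts:
        # Euler-Maruyama reverse step guided by the (placeholder) score
        x = x + dt * sigma**2 * score_model(x, y, t)
        x = x + sigma * np.sqrt(dt) * rng.standard_normal(y.shape)
    return x
```

With a trained score network in place of the toy one, shrinking `n_steps` is exactly what yields the order-of-magnitude reduction in sampling cost described above.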
Diffusion-based generative models have had a high impact on the computer vision and speech processing communities in recent years. Besides data generation tasks, they have also been employed for data restoration tasks like speech enhancement and dereverberation. While discriminative models have traditionally been argued to be more powerful, e.g. for speech enhancement, generative diffusion approaches have recently been shown to narrow this performance gap considerably. In this paper, we systematically compare the performance of generative diffusion models and discriminative approaches on different speech restoration tasks. For this, we extend our prior contributions on diffusion-based speech enhancement in the complex time-frequency domain to the task of bandwidth extension. We then compare it to a discriminatively trained neural network with the same network architecture on three restoration tasks, namely speech denoising, dereverberation and bandwidth extension. We observe that the generative approach performs globally better than its discriminative counterpart on all tasks, with the strongest benefit for non-additive distortion models, as in dereverberation and bandwidth extension. Code and audio examples can be found online at https://uhh.de/inf-sp-sgmsemultitask
Recently, diffusion-based generative models have been introduced to the task of speech enhancement. The corruption of clean speech is modelled as a fixed forward process in which increasing amounts of noise are gradually added. By learning to reverse this process in an iterative fashion conditioned on the noisy input, clean speech is generated. Building on our previous work, we derive the training task within the formalism of stochastic differential equations. We present a detailed theoretical review of the underlying score matching objective and explore different sampler configurations for solving the reverse process at test time. By using a sophisticated network architecture from the natural image generation literature, we significantly improve performance compared to previous publications. We also show that we can compete with recent discriminative models and achieve better generalization when evaluating on corpora different from the training one. We complement the evaluation results with a subjective listening test in which our proposed method is rated best. Furthermore, we show that the proposed method achieves state-of-the-art performance in single-channel speech dereverberation. Our code and audio examples are available online, see https://uhh.de/inf-sp-sgmse
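The SDE formalism invoked here follows the generic score-based framework: a forward diffusion and its time reversal. As a hedged sketch in generic notation (the paper's concrete drift and diffusion coefficients may differ):

```latex
\mathrm{d}\mathbf{x}_t = f(\mathbf{x}_t, t)\,\mathrm{d}t + g(t)\,\mathrm{d}\mathbf{w}_t ,
\qquad
\mathrm{d}\mathbf{x}_t = \left[f(\mathbf{x}_t, t) - g(t)^2\,\nabla_{\mathbf{x}_t}\log p_t(\mathbf{x}_t)\right]\mathrm{d}t + g(t)\,\mathrm{d}\bar{\mathbf{w}}_t ,
```

where the intractable score $\nabla_{\mathbf{x}_t}\log p_t(\mathbf{x}_t)$ is replaced by a neural network $s_\theta(\mathbf{x}_t, \mathbf{y}, t)$ conditioned on the noisy input $\mathbf{y}$ and trained by denoising score matching; the sampler configurations discussed above are different numerical solvers for the reverse SDE.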
In this paper, a neural network augmentation of a Kalman filter variant of the weighted prediction error (WPE) method is proposed. The filter's stochastic variations are predicted by a deep neural network (DNN) trained end-to-end using the filter residual error and signal characteristics. The proposed framework allows for robust dereverberation on single-channel noisy reverberant datasets similar to WHAMR!. Because Kalman-filtered WPE predicts filter variations from the residual error alone, it introduces distortions in the enhanced signal when the target speech power spectral density is not perfectly known and the observations are noisy. The proposed method avoids these distortions by correcting the filter variation estimates in a data-driven way, thereby increasing the robustness of the method to noisy scenarios. Furthermore, it yields strong dereverberation and denoising performance compared to a DNN-supported recursive least squares variant, especially for highly noisy inputs.
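The predict/correct structure underlying this kind of Kalman-filter coefficient tracking can be illustrated with a minimal scalar example. This is a generic textbook sketch, not the WPE variant of the paper: the state is a single time-varying filter coefficient `w` following a random walk, and `q` plays the role of the filter-variation variance that the paper's DNN learns to predict instead of fixing a priori.

```python
import numpy as np

def kalman_step(w, P, x, d, q, r):
    """One predict/update step of a scalar Kalman filter tracking a
    time-varying filter coefficient w from regressor x and observation d.

    q: assumed variance of the random walk on w (the "filter variation")
    r: assumed observation-noise variance
    """
    # Predict: random-walk state model w_t = w_{t-1} + process noise
    P = P + q
    # Update: observation model d = w * x + noise
    e = d - w * x                 # residual (innovation) error
    k = P * x / (x * P * x + r)   # Kalman gain
    w = w + k * e
    P = (1.0 - k * x) * P
    return w, P

# Track a constant true coefficient from noisy observations
rng = np.random.default_rng(1)
w_true, w, P = 0.8, 0.0, 1.0
for _ in range(200):
    x = rng.standard_normal()
    d = w_true * x + 0.01 * rng.standard_normal()
    w, P = kalman_step(w, P, x, d, q=1e-6, r=1e-4)
```

A mismatched, hand-tuned `q` is precisely where such filters degrade under noise; the paper's contribution is to replace that fixed assumption with a data-driven, DNN-predicted correction.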